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.Estimating  and  modeling  gene  flow  for  a  spatially  distributed  species. 


By 

T.  Burr  and  T.  V.  Kurien 
Department  Of  Statistics 
Florida  State  University 

Abstract 

This  paper  models  the  genetic  behavior  of  a  large  population  of  individuals 
which  is  divided  into  colonies.  We  are  studying  the  relative  frequency  of  an  allele 
.41  at  a  specific  locus  and  over  all  the  colonies.  The  effects  of  various  migration 
patterns  across  colonies  on  these  relative  freqencies  are  studied.  We  set  down  a 
joint  distribution  for  the  relative  frequencies  of  A1  at  the  colonies.  This  joint  dis¬ 
tribution  is  Gaussian  and  allows  us  to  estimate  parameters  that  describe  the  extent 
of  genetic  exchange  across  colonies.  These  parameters  are  called  migration  rates. 
The  Gaussian  models  fits  well  to  observed  data  and  is  very  easy  to  simulate.  The 
model  can  be  extended  without  much  difficulty  to  describe  various  mating  patterns 
across  colonies. 


1  Introduction 

Population  geneticists  have  many  models  to  study  the  effect  of  geographical  subdi¬ 
vision  on  the  evolution  of  a  species. 

Consider  a  large  population  of  individuals  of  a  particular  species  which  is  to  some 
extent  subdivided  into  colonies.  Complete  subdivision  means  that  each  colony  is 
isolated.  At  the  other  extreme  is  no  subdivision.  This  means  that  ah  adults  in  the 
entire  population  are  equally  likely  to  mate  with  all  other  adults  of  the  opposite  sex 
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in  the  population.  It  is  believed  that  many  species  follow  mating  patterns  some¬ 
where  between  these  two  extremes.  To  set  the  stage  for  the  mathematical  models, 
the  necessary  genetical  terms  are  collected  here.  Most  organisms  are  diploid,  having 
chromosome.-  in  pairs,  one  inherited  from  each  parent’s  gamete  (sperm  or  egg)  cell. 
The  genotype  of  a  diploid  individual  is  the  specification  of  all  of  its  chromosome 
pairs.  It  is  sometimes  sufficient  to  model  a  diploid  species  as  if  it  were  haploid. 
Haploid  individuals  have  only  one  of  each  type  of  chromosome.  We  think  of  a  chro¬ 
mosome  as  a  long  string  of  symbols  from  hie  four-letter  DNA  alphabet  representing 
the  four  different  nucleotides  of  DNA.  At  a  certain  place  on  a  chromosome  (referred 
to  as  a  locus )  is  a  meaningful  string  of  several  hundred  symbols  called  a  gene.  Typ¬ 
ically  there  are  many  loci  on  a  chromosome.  The  possible  messages  that  could  be 
written  are  called  the  alleles  of  that  locus.  Though  the  characteristics  of  an 
individual  depend  in  a  complicated  way  on  all  loci,  it  is  informative  to  study  a  single 
locus.  If  this  locus  is  a  string  of  200  letters  there  are  4200  alleles.  At  the  present 
time  it  is  common  to  detect  alleles  by  electrophoresis  which  typically  is  able  to  de¬ 
tect  only  a  few  alleles  from  the  essentially  infinite  number  of  possible  alleles  at  the 
DNA  level.  If  only  two  alleles  can  be  distinguished  at  a  locus,  say  Al  and  A2,  then 
the  three  possible  genotypes  for  that  particular  locus  are  A1A2,  A1A1,  and  A2A2. 
which  are  referred  to  as  heterozygote ,  homozygote ,  and  homozygote ,  respectively. 
Natural  selection  is  in  effect  if  the  different  genotypes  have  different  probabilities  of 
contributing  to  the  gene  pool  in  the  next  generation.  If,  for  example,  heterozygotes 
have  a  larger  probability  of  surviving  to  adulthood  and  contributing  to  the  gene 
pool  o:  the  next  generation,  then  heterozygotes  are  said  to  have  a  greater  fitness. 
An  allele  that  is  not  affected  by  selection  is  called  neutral.  If  an  allele  is  altered  by 
a  copying  error  or  environmental  stimulus,  the  resulting  allele  is  called  a  mutant. 
The  present  work  considers  loci  with  two  neutral  alleles.  Note  that  if  there  are  more 
than  two  alleles  at  the  locus  under  study,  then  the  alleles  could  be  grouped  into  two 
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Two  simple  models  that  represent  extremes  in  migration  habits  are  the  Island 
Model  and  the  Stepping-Stone  Model.  (Wright,  1931  and  Ivimura,  1953)  In  the 
Island  Model  all  colonies  are  equally  likely  to  exchange  migrants  with  all  other 
colonies.  In  the  Stepping-Stone  Model  each  colony  exchanges  migrants  with  only 
ns  nearest  neighboring  coMMc*.  These  two  extremes  will  be  considered  here  and 
in  the  context  of  the  Gaussian  Model  will  be  referred  to  as  the  full-neighbor  and 
4-neighbor  models. 

Ivimura  (1953)  proposed  a  migration  pattern  where  migration  is  from  nearby 
colonies.  This  is  called  the  Stepping-Stone  model.  In  the  two-dimensional  Stepping- 
Stone  model  each  colony  is  located  at  a  grid  point  in  an  n  by  n  lattice.  Identify 
the  location  of  a  colony  by  i  where  i  6  {(0,0),  (0, 1), ...  (n  —  l,n  —  1)}.  The  version 
studied  here  is  the  following:  Each  colony  maintains  a  constant  population  size  N , 
with  2  alleles  at  each  locus.  Since  there  is  a  finite  number  of  colonies,  a  joint  non¬ 
trivial  stationary  distribution  of  the  relative  allele  frequency  of  Al  in  each  colony 
is  possible  only  if  there  is  reversible  mutation  or  migration  from  a  constant  outside 
source. 

Assume  here  that  there  is  an  outside  source,  say  a  mainland  population  with 
constant  Al  allele  relative  frequency  pM.  Denote  the  relative  frequency  of  allele  Al 
in  the  colony  at  i  by  p;.  To  specify  a  particular  migration  pattern,  assume  that  the 
colony  at  i  replaces  a  fraction,  m,  of  its  population  with  immigrants  from  the  four 
neighbors,  {(id  —  l,i2)>(»"i  -r  ls»2)>(*is*2  —  l),(ti,*2  +  1)}  and  also  replaces  a  small 
fraction,  m/,  of  its  population  with  immigrants  from  a  large  mainland  population. 
Here,  addition  is  modulo  n  to  avoid  edge  effects.  For  example,  the  four  neighbors 
of  (n  —  l,n  —  1)  are  {(n  -  2,  n  -  1),  (0,  n  —  l),(n  —  l.n  —  2),  (n  —  1,0)}.  Assume 
that  all  individuals  produce  many  gametes  (sperm  or  egg  cells)  and  the  stochastic 
component  of  how  one  generation  leads  to  the  next  is  due  to  the  binomial  sampling 
involved  in  reducing  each  population  to  size  N. 

Attempts  to  obtain  the  steady-state  distribution  of  the  relative  frequency  of  al- 
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lele  A1  in  a  single  colony  have  not  been  successful.  However,  a  Beta  distribution  ap¬ 
pears  to  give  a  good  approximation  in  simulation  studies  (Maruyama.  1977).  Weiss 
and  Ivimura  (1965)  obtained  an  expression  for  the  stationary  correlation  between 
Pi  and  pj  without  attempting  to  approximate  the  joint  stationary  distribution. 

A  rather  different  approach  is  to  approximate  the  joint  steady  state  distribution 
of  {pi  :  i  e  (fi,i2)  :  0  <  t‘i  <  n  —  1,0  <  t2  <  n  -  1}. 

There  has  been  no  published  attempt  to  rpproximate  this  joint  distribution; 
however,  in  the  full-neighbor  case,  as  n  — >  oo,  one  could  use  standard  diffusion 
theory  to  arrive  at  an  approximating  stationary  distribution.  (See,  for  example, 
Mathematical  Population  Genetics.)  The  approximating  stationary  distribution 
would  in  that  case  be  greatly  simplified  since  one  could  take  the  p;  to  be  independent 
beta  random  variables,  with  parameters  determined  by  the  number  of  migrants 
exchanged  per  generation  and  the  mean  allele  relative  frequency. 

To  see  that  the  Stepping-Stone  model  gives  rise  to  a  Markov  Random  Field, 
write  pi  for  the  relative  frequency  of  allele  A1  in  colony  i.  Then  the  conditional 
distribution  of  pj  given  the  relative  frequency  of  A1  in  all  the  colonies  is  the  same  as 
the  conditional  distribution  given  the  relative  frequency  of  Al  in  only  the  neighbors 
of  i.  Since  0  <  p\  <  1.  it  will  be  necessary  to  transform  the  {p;}  in  order  to  use  the 
Gaussian  model.  However,  for  some  values  of  the  parameters,  the  Gaussian  model 
should  fit  the  raw  data.  Under  the  Gaussian  model,  to  be  given  in  section  2,  the 
conditional  distribution  of  A";  given  the  relative  frequencies  of  Al  in  its  neighbors 
will  be  Gaussian.  Empirically,  it  appears  that  a  transformation  will  be  necessary 
except  if  p  is  near  .5  and  the  number  of  migrants  exchanged  is  fairly  large,  say  2 
or  more.  The  choice  of  transformation  was  made  by  considering  that  the  marginal 
distributions  do  appear  to  be  approximately  beta.  Typical  transforms  from  a  beta 
distribution  to  approximate  normality  include  the  logit  and  probit  transforms,  which 
are  log(p/(l  —  p))  and  d,_:i(p).  respectively,  where  <!>  is  the  standard  normal  cumu¬ 
lative  distrubtion  function.  For  some  values  of  the  parameters,  these  transforms 


might  be  improved  by  first  raising  p  to  some  power.  Therefore,  the  Gaussian  Model 
has  been  fit  to  2  =  log(p/(l  —  p))  , 

x  =  log(pA/(l  —  pA)),  x  =  and  to  x  —  Good  values  for  A  seem  to 

be  1.5  to  2. 

2  The  Gaussian  Model 

Notation 

Let: 

L?n  =  {(21,22)  :  0  <  z’i  <  n  —  1, 0  <  i2  <  n  —  1}  be  an  n  by  n  array. 

No  be  a  neighborhood  of  0.  For  example,  one  could  take  Nq  to  be 
{(0, 1),  (1, 0),  (0,  —1),  (  —  1,0)}.  This  is  the  usual  4-neighbor  lattice, 
i  +  j  =  (21  ©  j ij  22  ©  j 2)  where  ©  is  addition  modulo  n. 

Points  in  L2n  are  2  component  vectors  genericallv  denoted  by  i,j,k,and  1.  Let  1 
denote  the  vector  (1,1)  €  L2. 

Assume  that  the  joint  stationary  density  of  A’j  is  the  multivariate  normal  (MVN) 
density: 

m  =  M!1"  (2~)~r*/7 exp{  — (x  ~  Z  Ni).  (2.1) 

L'ote  that  the  mean  of  x  is  pi  and  the  covariance  matrix  of  x  is  A-1. 

To  capture  the  nearest-neighbor  migration  patterns,  rewrite  /(x)  as: 

/(*)  =  Ml'"  (2^)-"=^  exp{  — 1/2  y  53  c(ejd,  j)(x,  -  xi+j)2  -  62/2  •£  (x,  -  p)2}. 

ie^jeA’o  ieU 

(2.2) 

Here,  c(9:/d.  j)  determines  the  amount  of  migration  among  neighbors.  Large 
values  of  8\  correspond  to  large  migration  rates  between  neighbors.  Large  values  of 
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#2  correspond  to  large  migration  rates  from  the  constant  outside  source,  or  to  large 
mutation  rates.  The  index  j  allows  for  the  migration  rate  to  depend  on  direction, 
and  d  is  the  number  of  neighbors.  Assuming  d  is  known,  one  goal  will  be  to  estimate 
6 1  and  6 2- 

From  (1.1)  and  (1.2)  it  can  be  shown  that: 


'4o,o  —  -4-i,i  —  #2  -f  2  Y 

■Aij+j  =  -(c(6>i/<f,j)  +  c(0i/d,}))  if  j  €  Aro  (2.3) 

'4i,i+j  =  0  if  j  d  AV 

The  eigenvalues  of  .4  are: 

A£  =  e2  +  2  Y  c(0i/d,j)(l  ~  cos(2tt  <  j,  k  >  /n)).  (2.4) 

j€Aro 

Although  A£  depends  on  n  we  shall  write  A£  as  Aj.  from  now  on  for  notational 
convenience.  Also,  iet  A  be  the  diagonal  matrix  with  diagonal  entries  Aj.. 
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3  The  Stepping-Stone  model 

For  the  Stepping-Stone  model  we  shall  assume  isotropic  migration,  by 

putting  c{8,/ 4,j)  =  c(6\/4,  —  j)  =  8 i/4.  The  Stepping-Stone  model  accounts  for 

local  migration  by  taking  A'o  —  {(0, 1),  (1, 0),  (0,  — 1).  (— 1, 0)}. 

Then  (2.3)  and  (2.4)  simplify  to: 

-4-0,0  =  -4-ij  =  62  4-  26, 

'4-i.i+j  =  —8\/2  Vi  £  L?n  ii  j  €  A'o  (3-1) 

Ai,i+j  =  0  if  j  ^  No 


-h  =  #2  +  #i((l  —  cos(2'^k1/n)  -f  (1  —  cos(2 ~h7/n)).  (3.2) 

The  model  accounts  for  long  distance  migration  (Island  model)  if  we  take 
A” 0  =  L?n  —  (G,0),  which  is  the  full-neighbor  version.  Then  (2.3)  and  (2.4)  simplify 
to: 


-4-0,0  =  -4i,i  —  82  +  S8,/(n2  —  1) 

-4i,i+j  =  -28, /(n2  —  1)  Vi  G  L\  if  j  6  I2  -  (0,0)  (3.3) 

Ak  =  82  -f  2n28,/(n2  —  1)  if  k  6  T2  —  (0,  0) 

^0,0  =  82.  (3.4) 

In  order  to  generate  data  from  the  density  (2.2),  first  generate  n2  independent 
standard  normal  random  variables,  say  Z  ~  N(0,J),  where  I  is  the  n 2  by  n2  identity 
matrix.  To  obtain  X  ~  A‘(0,  .4"1 ),  use  the  fact  that  if  Y  ~  A:(0.  X)  then  CY  ~ 
A  (0,  CTCr)  for  C  an  appropriately  dimensioned  matrix  of  constants.  Now  use  the 
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spectral  decomposition  of  the  covariance  matrix  A-1  in  the  usual  way  to  find  C  such 
that  CCT  =  A-1.  This  gives  the  following  prescription  for  generating  the  data.  The 
observation  at  the  location  1  =  (hJ?)  is  generated  by: 


A'i  =  p  -  n  2  J2  Zk  W 


cos(2~(<  l.j  >  A  <  k.j  >)/n) 

A 


(3.5) 


This  means  that  an  observation  from  the  steady-state  distribution  of  an  n  by 
n  array  requires  simply  the  generation  of  n 2  standard  normad  random  variables, 
and  performing  the  indicated  summation.  This  is  much  faster  than  simulating 
the  migration  and  reproduction  (random  sampling  of  gametes)  pattern  for  many 
(approximately  100)  generations  until  stationarity  is  reached. 

One  of  the  main  results  of  previous  work  with  the  stepping  stone  model  is  how 
the  covariance  between  po  and  p\  depends  on  the  dimension  of  the  habitat  (linear, 
in  the  plane,  or  in  three  dimensions),  the  migration  pattern,  and  the  migration  rate. 
Maruyama,  Kimura  and  Weiss  all  used  recursion  equations  to  model  the  way  that 

tier,  and  solved  for  the  stationary  covari- 
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ance,  cov(p0,Pi)-  An  attractive  feature  of  the  Gaussian  model  is  that  cov(A’o,A’i) 
is  easier  to  compute  than  it  is  when  working  with  the  recursion  equations.  From 
(3.5)  it  can  be  shown  that: 


/  v  v\  -4  cos(2/.  (<  j.  k  >  +  <  s.  k  +  1  >)/n) 

cov(A0,Ai  )  =  n  ^  1 - 7= - •  (3.6) 

keLl  j.s  CLI  \/Aj  As 

It  is  known  that  migration  rates  among  partially  isolated  colonies  need  not 
be  very  large  to  prevent  genetic  diversity.  However,  if  migration  tends  to  be  from 
nearest-neighboring  colonies  rather  than  the  entire  population,  then  there  is  greater 
potential  for  genetic  diversity  (Crow  and  Aoki,  19S2).  The  Gaussian  model  has  this 
same  feature.  One  way  to  see  this  is  to  solve  for  the  variance  of  the  equilibrium 
distribution  of  the  relative  frequency  of  allele  A1  in  any  of  the  identical  colonies. 
Let  X\:  be  given  by  equation  2.4.  Then  it  follows  from  the  spectral  representation 


of  the  variance-covariance  matrix  that  the  variance  of  the  relative  frequency  of  A1 
in  each  colony  is: 

var(X)  =  n~7  £  1/Ak.  (3.7) 

This  provides  an  easy  proof  that  var(X)  is  less  in  the  full-neighbor  model  than 
in  the  4-neighbor  model,  which  is  intuitively  expected.  To  see  this,  first  note  that 
A0,o  =  #2  in  both  the  full-neighbor  and  the  4-neighbor  models.  Next,  it  can  be 
shown  that  all  remaining  eigenvalues  are  the  same  in  the  full- neighbor  case,  but  are 
not  all  the  same  in  the  4-neighbor  case.  Also,  the  sum  of  the  eigenvalues  can  be 
shown  to  be  the  same  in  both  cases.  Then,  since  the  harmonic  mean  of  a  collection  of 
unequal  numbers  is  necessarily  less  than  the  arithmetic  mean  of  that  same  collection 
of  numbers,  the  result  follows. 

4  Estimation 

The  three  parameters  can  be  estimated  by  maximum  likelihood  using  the  density 
in  (2.1).  The  following  is  specifically  for  the  4-neighbor  model  but  could  easily  be 
modified  for  any  neighborhood  structure. 

The  MLE  /2  of  n  is  the  sample  mean  x. 

Let  u?k  =  1  —  cos(27r fcj/n)  +  1  -  cos(27r/r2/n),  and  let  d  =  4.  Note  that 
Ak  =  82  +  (see  3.2). 

The  likelihood  equations  for  6 1  and  d2  are: 


E 

4wk 

=  n'JE  E  (*i  -  *i+j)2 

iei>j€*0 

(4.1) 

0  2  + 

"'J  E 

1 

=  n"2  E  (*1  ~^)2 

(4.2) 

6  2  +  #iwk 
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Theorem  1. 


The  RHS  of  (1.1)  is  equal  in  distribution  to  n~ 2  EkeL*  4a>icA£1x?-  The  RHS  of 
(1.2)  is  equal  in  distribution  to  n-2  Ek^o.o)  Aj^x'2. 

Proof 

The  RHS  of  (1.1)  is  a  quadratic  form,  so  it  can  be  expressed  as  XTBX,  where 

■Bo,0  =  -®i.i  =  4 

Boj  =  Ri,i+j  =  0  for  j  ^  N0 
=  1  for  j  €  No- 

Let  the  covariance  matrix  of  X  be  E,  and  let  the  matrix  with  the  eigenvectors 
oLE  be  denoted  PT.  Let  Y  =  PTBl'2X.  Then  YTY  =  XTBX ,  Y  is  a  vector  of 
independent,  mean  0  normal  random  variables,  and  has  variance  The 

result  follows  by  observing  that  a  x 2  random  variable  can  be  generated  by  squaring 
a  standard  normal  random  variable. 

To  find  the  distribution  of  the  RHS  of  (1.2),  let  Y  =  PT X  so  that  YTY  =  XTX, 
and  Y  ~  N (Pr/i,  A-1).  Therefore,  Ek*o,0  =  ^1X2(^n2/i)  +  Ek#o,o  ^kV-  Then, 
n~2  Ej6L2  (®i  -  x)2  =  n-2  Ei6L2  r2  -  x2  =  n~2  Ej6l2  V?  -  z2-  The  result  follows  by 
observing  that  y0,o  =  nr.  □ 

Now  let  dj  and  a2  be  defined  by: 
fli  =  lim„-oon“2EkeL2n4^k/Ak 

=  So  So  4(2  “  cos(2ttx)  -  cos(2:ry))/(02  +  0i(2  -  cos(2ttx)  -  cos(27ry)))  dxdy  < 
a2  =  limn_o0  n“2  EkeiJ  V^k 

=  Jo  So  (^2  +  (2  -  cos(2ttx)  -  cos(27ry)))-1  dxdy  <  oo. 

By  a  version  of  the  CLT,  the  RHS  of  1.1  is  AN(ai,<72),  where  cr2  — +  0.  Similarly, 
the  RHS  of  1.2  is  AN  with  mean  a2  and  a  variance  that  goes  to  zero  as  n  -4  oo. 

The  asymptotic  distribution  of  the  MLE’s  of  and  02  has  been  established  by 
first  noting  that  the  likelihood  can  be  written  in  terms  of  independent  random  vari¬ 
ables  by  using  Y  =  PTX,  where  PT  is  the  matrix  of  eigenvectors  of  the  covariance 
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matrix  of  X.  As  in  the  proof  of  theorem  1,  this  follows  because  Y  ~  N(PTfj.,  A-1), 
malting  Y  a  vector  of  n2  independent  normal  random  variables  with  different  vari¬ 
ances,  one  with  mean  n/i,  and  all  others  with  mean  0.  The  likelihood  is  then 
nk€Lgfk(*ki*)-  where  fk(*k,0)  =  (W(2«))1/aexp(-Akx£/2)  for  all  k  except  k  = 
(0,0),  since  y0,o  has  nonzero  mean,  with  density  (02/(27r))1/,2exp(— $2(20,0  —  np.)7/2). 

The  likelihood  equations  for  Y  are  the  same  as  those  for  X .  The  asymptotic 
normality  of  6  =  (8 1,  #2)  now  follows  from  theorem  2(iv)  of  Bradley  and  Gart  (1962). 

Specifically,  let  Cj,  c2,  and  c3  be  defined  by: 

2  Ci.  =  limn^00n-2Ek6L?1^k/Ak 

=  /o  /01( 2  —  cos(2ttx)  —  cos(2/ry))2/(02  +  #i(2  —  cos(2xx)  —  cos(27ry)))2  dxdy 
2c2  =  lim„^TO  rr2  EkeL*  wkAk 

=  Jo  Jo  (2  “  cos(2ttx)  —  cos(2~y))/(#2  +  #i(2  —  cos(2ttx)  -  cos(2~y)))2  dx  dy 
2c3  =  lim*^  n"2  £k6Lj  A£2 

=  fo  fo  (82  +  6>i(2  -  cos(2-x)  -  cos(2t ry))“2dxdy  . 

It  follows  that  Ci,C2,  and  c3  are  finite. 

Theorem  2. 

n(d  —  8)  —>  N(0,  I*1),  where  1= 

In  particular,  n(8i  —  61)  —>  N(0,  c3(cic3  —  c2)-1)  and 
n{82  -  82)  N(0,  ci(c!c3  -  c^)-1). 

Theorem  2  (iv)  of  Bradley  and  Gart  is  appropriate  when  the  number  of  pop¬ 
ulations  sampled  increases  as  the  sample  size  increases.  The  proof  will  consist  of 
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verifying  the  conditions  of  their  theorem.  In  order  to  conform  more  with  their  no¬ 
tation.  the  boldface  vector  notation  will  not  be  used  while  verifying  the  conditions. 
Since  the  data  is  a  single  observat'on  from  each  site  in  an  n  by  n  array,  the  sample 
size  is  n 2  instead  of  the  usual  n.  Assume  that  one  sample  is  taken  from  each  density 
f ,(x,,  #).  The  joint  likelihood  is  then  n?=i  f.(zt,  #)•  In  general,  it  is  not  necessary 
that  each  f,  depend  on  both  8 j  and  $2,  but  in  the  present  case  each  f,  does  depend 
on  both  6l  and  02. 

Let  ^  =  (81,82),  Q  —  (O.oo)  x  (0,co).  The  conditions  to  verify  are: 

1(a).  For  almost  all  x,  €  3?  and  for  all  8  6  Q,  31og fi/ddi,  d\ogfi/d82,  <92log/,7<9#2, 
32log f,/d6\,  32log fi/de.d^,  33log f{/d8\d87,  33log fi/dS.dd2  exist  for  i  =  1, . . . ,  n2. 


1(b).  For  all  f,-,  for  almost  all  x,-,  and  for  every  8  £  Q, 

\df,/d8x\ <  Ffl (*,•),  \8fi/d82\  <  F,-2(x,-),  |32/,/^|  <  Ftll(x,-),  |32/,-/302|  <  F,22(x,), 
I d'fi/ddM  <  F,-12(x,),  |33log/,/3d2<902|  <  H,n2(x,),  \d3logfJ 88:88- \  <  H:]22(x;). 
where  F,r(x,)  and  Ft>i(x1)  are  integrable  over  3 1  and  H,ri<(x,)ft-  dx,  <  M,, 
with  the  M.  finite  positive  constants,  for  i  =  1, ...  n2  —  1,  and  r,  s,  i  =  (1,  2). 


2(a).  For  the  sequence  of  density  functions  { f ,- } ,  YjJlj  jjj  f,  dx,  =  o(l), 
where  Daf  =  {|31og/,/36>r|  >  n2}  and  foai  (d\ogf,/d8r)2  f,  dx,  =  o(n4 ), 
where  D2l  =  {\d\ogfi/d6r\  <  n2}  for  r  =  1,2. 


2(b).  For  the  sequence  of  density  functions  { f , } ,  Xj,r=i  Jq  f,  dx,-  =  o(  1), 
where  D3,  =  { \d2\ogf,/d8Td8s\  >  n2},  JDf,  |32log/;/3^3^|  f,  dx,  =  o(n<), 

where  D4l  =  {\d2\ogf,/d8rd8s\  <  n2},  and 

limns^oc  n~2  Er=i  fst  \d2\og/i/d8rd8f\  f,  dx,  =  J  rs(8)  exists,  and  that  JTl(0)  be  posi¬ 
tive  definite  with  finite  determinant  for  r,s  —  1,  2. 
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2(c).  Referring  to  HlTSt  and  Mx  in  1(b).  Ev=i /d4>- =  °(1)>  where  Ds,  = 
{Hirsi  >  n2}  and  n~1  Mi  <  M  with  M  a  finite  positive  constant. 

3(a).  For  every  e  >  0,  lim,,-^,  n-2  /dt,  E?=i  (dlog/,-/d0T)*  fi  dx,  =  0,  where 
£>7.  =  {Er=i  (51og fi/d6rf  >  en2.} 


Verification  of  the  conditions  will  now  be  given. 

Recall  that  for  i  =4  (0,0),  f,(x,-,  0)  =  (A,/27r)I/,2exp(  — A,x2)u.',-, 

and  a.',-  =  2  —  cos(27721/n)  —  cos(2tt i2/n).  The  i  =  (0,0)  observation  has  non-zero 
mean  and  should  be  treated  separately  since  the  derivatives  needed  will  be  slightly 
different.  It  is  easily  verified  that  the  (0,0)  observation  poses  no  difficulty,  so  for 
brevity,  will  not  be  treated  here. 


1(a).  Clearly  these  derivatives  exist,  and  are  recorded  here  for  later  use. 

diogfi/ae,  =  (u,’,/2)(i/A,  -  X,2),  d\ogf{/de2  =  (i/2)(i/a,  -  x2), 

a2iog Mae*  =  — u,2/(2A2).  52iog fi/ael  =  -i/(2A2),  d^u/ae.ae,  =  -a.-,/(2A2), 

Fiogfi/aeiaBt  =  uf/Af,  Fiog/i/ae^e*  =  ^/a3 


1(b).  dfi/dBt  =  (A-1/2  -  A^)exp(~A,x2/2)/(2(2;r)V2) 

Since  0  <  a;,-  <  4,  and  #2  <  A,  <  #2  -f  4#i, 

|5iog/,/a^|  <  (62-1/2  -f  (62  +  4^)1/2)exp(-A,x2/2)/(2(2-)1/2)  =  F,2(x,),  which  is 
integrable  over  3b 

Since  dfi/88 j  =  u.’t<9/,/<902,  let  Ft](xt)  =  4F,-2(x,),  which  is  integrable  over  9?. 
Next, 

=  -(>r3,;  +  A-,/:  +  4XAT,/’  -  AF))exp(-A,*;>/2)/(2(2r)>/=) 

W!/,/a«,;|  <  (03'3'3  +  +  x;(ei'P  +  (9=  +  491)',?))exp(-Aix;/2)/(2(2x)'/=) 

=  F.„ 
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Since  d2  fu!dQ\  =  tu?32 let  Ftll  =  4F;22  and  clearly  both  F,n  and  Fl22  are 
integrable  over  3?. 

Similarly, 


d2fi/de1de7 
I  d2fi/delde2 


=  — a-'i(2_1  A~3/2-r 2-1Aj/2  +  (A,-1/2  -  A11/2)x?/2)exp(-A,r?/2)/(2(2 tt)V2) 
<  (023/2  4-  02"1/2  +  (021/2  +  (02  +  401)1/2)x?/2)exp(— A,x?/2)/(2t:)1/2 

=  f,12. 


And  F,i2  is  integrable  over  3?. 
Next, 


Flogfi/ddlddi  =  |d3log/,/302<902|  =  u;2/A?  <  25/03  =  ff,-112 
Fiogfi/ddiddl  =  \d3\og fi/de\de7\  =  w,-/a?  <  5/02  =  jyil22 

The  condition  KlTStfidxi  <  M;  is  satisfied  for  all  z  for  r,s,t  =  1,2  with  M;  = 

25/0j. 

2(a).  <91og/;/502  =  (1/A;  -  x2)/2  ,  so  limn_oc,  P;(D2i)  =  0,  where  P;(D2,-)  is  the 
measure  associated  with  the  random  variable  X{.  To  see  this,  note  that  \l/2x,  ~ 
N(0,1),  so  x2  ~  A;x2.  Therefore,  P;(|x?  -  1/A, |  >  2 n2)  =  P(|x2  -  1|  >  2(7z2)A.-)  < 
l/(2A2rz4)  <  l/(20^n4),  for  all  i,  by  Chebvshev’s  inequality. 

This  means  that  £F=i  /;dx;  <  n2(02  +  401)1/2/((2-)N:2^774)  ,  which  is  o(l). 
For  the  second  part  of  2(a),  note  that  ( dfi/d82 f  =  (x?  -  2x2/A,  +  l/A2)/4,  so 
;D,(a/;/a02)2/;dx;  <  (3/02  -f2-r  \/8\)/A  =  b,  for  all  z,  so  the  condition  holds, 
since  7i26  is  o(774  ). 

Since  d\ogfi / d8\  =  u:,d\ogft/d82.  these  same  conditions  can  be  verified  for 
d\ogfi/d8\  in  the  same  way. 


2(b).  For  r— 1  and  s  =2,  D3;  =  {|32log/;/501302|  >  tt2},  and  |32log/;/50i<902|  = 
u-’,/(2A2)  <  2/02,  so  hm„_00  P;(D3;)  =  0  .  Therefore,  for  n  >  2/02,  all  terms  in  the 
sum  are  zero,  so  the  sum  is  zero. 


14 


By  inspection  of  <92log/,/d#2  and  <92log f if  661,  similar  results  follow  for  each  of 
these. 

Since  d‘\ogfi/ 39\,  <92log/,/ 66\,  and  32log/,7d#i<9#2  are  all  bounded,  the  condi¬ 
tion,  /d4  (<52log fi/d8rdds)7  fidxi  =  o(n4),  is  easily  verified. 

Also,  <92log fi/66\,  <92log fifd6\.  and  S2log/7<3#id#2  are  all  constants  and 
limn_oo  P,(D4i)  =  1,  so  the  limit: 

limn—oo  mh  /d,  —  \d‘\ogfi/d6rd8s\  fidxi  =  Irs  exists  for  r.s  =  1,2.  The  three 
elements  of  I  were  identified  earlier  as  Cj,  c2,  and  c3,  with  2 ca  =  limn_00  n-2  tr2/A2, 

2c2  =  limn_oo  n-2  u.',-/A2,  2c3  =  limn_oo  n-2  E”=i  1  /A?. 

The  matrix  I  must  be  positive  definite,  so  it  is  necessary  that  c:c3  >  c2.  This 
follows  from  the  Caucy-Schwarz  inequality.  Also,  the  determinant  of  I  is  finite,  since 
Cj,  c2,  and  c3  are  all  finite. 

2(c).  It  has  been  shown  that  it  suffices  to  take  Ht-rs<  =  2b /8\  for  r.s.t  =  1,2, 
so  for  n  >  2b /6\,  P(D5!)  =  0.  Therefore,  the  conditions  involving  integrals  over 
D5t-  and  D6;-  hold.  Also,  it  follows  that  H,rjt  <  M,-  =  25/^^  so  ^at  ’n~2YL'f=  \  Ms  is 
bounded  by  2b /6\,  which  is  finite. 

3(a).  This  condition  may  be  verified  using  the  same  approach  as  in  2(a),  so  will 
not  be  repeated  here.  □ 

5  Goodness  of  Fit 

A  Gaussian  model  has  been  proposed  to  explain  the  stationary  distribution  of  the 
relative  frequency  of  allele  A1  in  the  Stepping-Stone  model.  The  simplifications 
achieved  justify  its  use,  provided  that  it  adequately  describes  the  data  and  makes 
predictions  that  can  be  tested. 

Two  predictions  of  the  Gaussian  model  that  can  be  tested  on  real  data  are 
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the  behavior  of  cov(A'o.A'i)  and  the  behavior  of  var(A’)  ,  where  X  is  the  sample 
average.  Equation  3.6  gives  the  covariance  between  Ao  and  Aj  and  it  can  be  shown 
that  var(A’)  -  l/(rr#2)-  These  two  predictions  have  been  compared  to  several  sets 
of  data  simulated  from  the  Stepping-Stone  model  described  in  section  1.  For  the 
simulated  data,  it  does  appear  that  var(A’)  «  l/(n2#2),  In  the  Gaus'-mr.  model, 
equation  3.6  implies  that  cov(Xo,Xi)  decreases  with  separation  between  0  and  1  at 
a  faster  rate  for  small  6^.  The  same  is  true  for  simulated  data  from  the  Stepping- 
Stone  model. 

Also,  any  test  of  multivariate  normality  can  be  applied  to  the  simulated  data. 
The  Handbook  of  Statistics,  Volume  1  has  a  few  tests  and  reference  to  others.  These 
tests  have  been  applied  on  the  marginal  distributions  from  200  observations  of  a  4  by 
4  array  at  steady  state:  probit  plots,  D’AgostinoT  D  (1971),  Shapiro  and  Wilk's  W 
(1965),  skewness,  kurtosis,  and  the  Box  and  Cox  transform  (Gnanadesikan,  1977). 
Bivariate  normality  was  checked  using  the  Box  and  Cox  transform.  Multivari¬ 
ate  normality  (the  joint  distributions  of  all  16  random  variables)  was  checked  using 
multivariate  tests  of  skewness,  kurtosis  (Mardia,  1970),  Malkovich  and  Afifi’s  (1973) 
generalization  of  W,  a  xle  probability  plot  of  the  Mahalanobis  distances,  and  the 
associated  Kolmogorov- Smirnoff  (I\S)  test.  Although  the  probability  plot  looked 
nearly  linear,  the  uniform  random  variable  associated  with  the  KS  test  was  .947. 
(Large  values  indicate  lack  of  fit.)  The  Box  and  Cox  transform  was  used  on  the  raw 
data,  say  p-u  for  i=  1,2,. . . ,  16,  and  on  pj/(l  —  p\).  X ith  the  raw  data,  it  was  best 
not  to  transform  and  with  pi/(l  -  p\),  the  log  transform  was  best.  Regarding  the 
marginals.  3  of  the  16  skewness  tests  rejected  and  5  of  the  16  kurtosis  tests  rejected. 
Because  pm  was  taken  to  be  .5,  skewness  is  not  expected  but  nonnormal  kurtosis  is 
a  possibility.  However,  with  200  observations  it  would  be  surprising  if  none  of  the 
tests  rejected  the  Gaussian  assumption  since  it  is  an  approximation  to  the  unknown 
distribution.  When  r\u  was  taken  to  be  .7,  log(p;/(l  -  pj))  has  a  skewness  of  about 
.4  and  a  kurtosis  of  about  3.4.  This  can  be  improved  by  using  log(p-x/(l  —  p() 
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with  A  =  1.5  or  2.  A  better  transform  appears  to  be  <£-1(pi)  or  where 

A  =  1.5  or  2.  With  p_\f  =  .7  and  using  <£-1(p?-5),  200  observations  from  a  5  by  5  ar¬ 
ray  appeared  to  be  approximately  MVN.  Multivariate  skewness  and  kurtosis  were 
both  within  acceptable  range,  the  Xjs  plot  °f  Mahalanobis  distances  looked  very 
good,  and  the  uniform  random  variable  associated  with  the  Kolmogorov- Smirnoff 
test  was  .0717.  Only  one  of  the  25  marginal  tests  of  skewness  and  one  of  the  25 
marginal  tests  of  kurtosis  were  rejected  at  the  .05  level.  All  of  the  above  tests  were 
with  m/ -j-m  >  1  to  make  the  number  of  occurrences  of  p\  =  1  small.  An  observation 
of  pi  =  1  was  changed  to  (A7  —  .5)/A7  where  N  is  the  number  per  colony.  Hereafter, 
it  is  assumed  that  mi  +  m  >  1. 

Assuming  that  the  Gaussian  Model  adequately  describes  the  data,  it  is  of  inter¬ 
est  to  estimate  and  8 2  from  simulated  data  and  make  predictions  about  the  effect 
of  changing  from  the  4-neighbor  to  the  full-neighbor  models.  Using  the  $-1(pb5) 
transform,  with  n  =  10,  mi  =  .1,  and  m  =  1.0,  1.5,  2.0,  2.5,  3.0,  8l  and  82 
were  estimated  by  maximum  likelihood  using  a  grid  search  to  find  good  start¬ 
ing  values,  followed  by  the  Newton- Raphson  technique.  The  estimated  values 
were  used  to  predict  the  variance  of  the  marginals  for  the  full-neighbor  model. 
Using  81/ 99  versus  $i/4,  for  m  =  1.0,  1.5,  2.0,  2.5,  3.0,  the  predicted  val¬ 
ues  for  the  variance  were  .99,  .73,  .57,  .47,  and  .41,  respectively.  As  expected, 
these  predictions  are  lower  than  those  observed  in  the  4-neighbor  case,  which  were 
1.39,  1.06,  .85,  .71,  and  .63,  respectively.  A  second  simulation  with  the  same 
values  of  m  but  using  the  full-neighbor  migration  pattern  produced  variances  of 
1.0,  .73,  .57,  .45,  and  .39  for  the  5  values  of  m.  These  are  in  good  agreement  with 
what  had  been  predicted. 


6  Summary 


A  Gaussian  model  has  been  fit  to  data  simulated  from  the  Stepping-Stone  Model 
used  in  population  genetics.  Standard  tests  of  multivariate  normality  on  the  trans¬ 
formed  data  suggest  that  the  fit  is  acceptable,  with  the  transform  or  <I>  — 1  (p1  "5 ) 

performing  well.  In  addition,  implications  of  the  model  do  appear  to  hold  for  the 
transformed  data. 

The  parameters  of  the  Gaussian  have  been  estimated  by  maximum  likelihood, 
and  the  asymptotic  distribution  of  the  maximum  likelihood  estimators  has  been 
established. 

The  previous  result  by  Kimura  and  Weiss  for  cov(A'o.  A])  has  been  derived  (for 
the  transformed  data)  without  recourse  to  recurrence  equations.  A  comparison  of 
the  Stepping-Stone  neighborhood  structure  with  the  other  extreme  neighborhood 
structure,  the  Island  model,  has  been  made  in  terms  of  var(A'j).  A  more  complete 
comparison  using  the  total  variation  distance  between  the  two  joint  distributions  is 
possible. 

Finally,  the  Gaussian  model  promises  to  have  other  applications.  For  example, 
the  residuals  from  a  regression  model  with  spatial  autocorrelation  could  be  assumed 
to  follow  the  Gaussian  model.  The  parameter  61  provides  a  natural  alternative  in 
the  hypothesis  testing  that  is  sometimes  used  in  that  context. 
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